System and method for determining geo-location(s) in images

ABSTRACT

Determining GPS coordinates of some image point(s) positions in at least two images using a processor configured by program instructions. Receiving position information of some of the positions where an image capture device captured an image. Determining geometry by triangulating various registration objects in the images. Determining GPS coordinates of the image point(s) positions in at least one of the images. Saving GPS coordinates to memory. This system and method may be used to determine GPS coordinates of objects in an image.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application claims the benefit of U.S. Provisional Patent Application No. 61/267,243 filed Dec. 7, 2009, the disclosure of which is incorporated by reference herein.

BACKGROUND OF THE INVENTION

The present invention relates generally to position determination, and more particularly to determining the position, location or coordinates of objects, or points in an image with respect to a coordinate system other than the local coordinate system of the image itself.

Determining the position of various objects in an image is useful for understanding the distance between objects or possibly the absolute distance between an object and another object that may not be in the image. Positions of objects in an image are usually determined by taking a photograph of an area. In the photograph there is usually a reference object whose position is known. The position of the other objects in the image is determined by computing the distance away from the reference object with the known position. The reference object with the known position enables the other objects to have known positions within the coordinate system of the photograph as well as a coordinate system that may be outside of the photograph.

Another way of determining the approximate positions of objects is simply by GPS image tagging. A GPS enabled camera may be utilized to take photographs. Each photograph may be tagged with the GPS information of the location of the camera when the photograph is taken. The GPS information can be associated with the captured image via a time stamp of both the GPS information and the image, or the GPS information can simply be embedded in the digital data of the photograph. Unfortunately, GPS tagging only records the position of the camera when the photograph was captured. The GPS positions of objects in the photograph are still unknown.

Another method of determining positions of objects is through surveying. Surveyors use optical instruments to locate and/or triangulate positions on the ground. For example, a surveyor may be required to survey the boundaries of a property. Usually, the surveyor will find the nearest monument that has a known position. In this example, the monument may be several blocks from the property to be surveyed. Once a monument is located, the surveyor uses optics and the human eye to physically measure distances from the monument. Unfortunately, surveying is prone to errors such as optical errors and human eye errors. Also, the surveying method only determines the position of a single point at a time with a great deal of effort.

Unfortunately, many of the systems and methods described above have problems. The position of an object may still be required even when there are no positions of reference objects available. Multiple points may need to be determined simultaneously. Accuracy greater than the errors inherent in the human eye and optics allow may be required.

BRIEF SUMMARY OF THE INVENTION

The present invention relates generally to position determination, and more particularly to determining the position, location or coordinates of objects, or points in an image with respect to a coordinate system other than the local coordinate system of the image itself.

Position determination may be implemented in a general purpose computer or an embedded system, for example a digital camera, cellular handset, GPS receiver, or any combination of these devices. In one embodiment the invention may determine the GPS coordinates of each pixel in an image. The pixel coordinates may be determined by first performing image registration on three or more images to determine corresponding pixels; then receiving the GPS coordinates at each camera position where each of the three images was taken; then determining the rotations of the various camera coordinate systems with respect to the ECEF coordinate system or WGS84 coordinate system or other coordinate systems, and the translations of the camera coordinate systems with respect to the ECEF coordinate system or WGS84 coordinate system or other coordinate systems; then determining the GPS coordinates of each pixel in the image using the rotation and translation information; and finally saving the GPS coordinates into a memory. In some embodiments the GPS coordinates of each pixel are determined with three or more images. In other embodiments, the GPS coordinates of each pixel are determined with a single image and information from an Inertial Measurement Unit (IMU).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a method showing one aspect of the present invention, for determining position coordinates.

FIG. 2 is a block diagram showing one aspect of the present invention depicting a limited number of components constituting an exemplary system for implementing the present invention.

FIG. 3 is a diagram showing one aspect of the present invention depicting the geometry where at three separate positions image capture devices capture images that include point P.

FIG. 4 is a block diagram of a method showing one aspect of the present invention for determining the GPS coordinates of a point(s) positions.

FIG. 5 is a diagram showing an alternative depiction of FIG. 3.

FIG. 6 is a block diagram of a method showing one aspect of the present invention for determining the GPS coordinates of point(s) positions comprising trilateration.

FIG. 7 is a block diagram of a method showing one aspect of the present invention for determining the GPS coordinates of point(s) positions using GPS/IMU.

DETAILED DESCRIPTION

In the following description of the preferred embodiments of the present invention, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.

With reference to FIG. 1, in block 101 the process determines registration between three or more images. Registration determines the corresponding points, pixels, shapes, objects, lines, corners, neighborhoods, regions, projected images or any other type of registration object that can be depicted in an image or multiple images. In block 103 the process receives position information. In one embodiment, receiving position information comprises receiving the position information from each camera location where an image was taken. For example, a person with a camera with a built in GPS receiver may take three pictures of a single area or object from three separate locations. The GPS receiver tags each of the pictures or images with the GPS information from each location where the picture was taken. These GPS tags are received by the process. In other embodiments the position information may be a relative position (location) of a receiver with respect to transmission towers in a telecommunications system or in a radio frequency based system. In other embodiments receiving position information comprises receiving GPS coordinates that may be WGS84 or ECEF or derived from WGS84 or ECEF. In other embodiments, receiving position information comprises receiving a location in any positioning system. In block 105 the process determines the geometry. In one embodiment determining the geometry comprises using the position information to determine distances and angles between the various camera locations and corresponding registration objects in the images. In another embodiment determining the geometry comprises using the position information to determine the rotations and translations between various camera locations and corresponding shapes in the image. In another embodiment determining the geometry comprises triangulating various registration objects in the captured images of the various cameras. In another embodiment determining the geometry comprises determining the 3-D model associated with the distance to visible objects in the captured images. In block 107 the process determines the position coordinates. In one embodiment determining the position coordinates comprises determining the coordinates with respect to a coordinate system external from the camera coordinate system of various shapes in the picture or image. In block 109 the process saves the position coordinates. In one embodiment saving the position coordinates comprises saving position coordinates to a memory. Each of the blocks may be performed by the process in an order that is different than that which is depicted in FIG. 1. Additionally, each of the blocks may be optional in the overall process described in FIG. 1.

FIG. 2 depicts several components that may be used to implement the present invention. An image capture device 203 may be used in the present invention to capture images, pictures, photographs, JPEGs, Motion JPEGs, or video, all referred to herein as images or image in the singular. A position determination device 205 may be used in the present invention to determine the current position of the image capture device 203 upon capturing an image. In one embodiment, a general purpose computer 207 in the form of a conventional personal computer may be used in the present invention to execute instructions. In other embodiments computer 207 may be a computer embedded in another type of system.

With reference to FIG. 2, 203 is an image capture device. In some embodiments, 203 is a digital camera that comprises an optical lens, memory that may be removable, a controller for controlling the digital functions of the camera, and a communications port for transferring the digital images to another device or computer. In other embodiments, 203 may be a video camera that comprises an optical lens, storage for storing the captured video frames such as memory, tape, or optical disk, a controller for controlling the digital functions of the camera, and a communications port for transferring the video frames to another device or computer. In other embodiments 203 may be a cellular phone handset that comprises an optical lens, storage such as a memory for storing captured video frames or images, an embedded processor for controlling the functions of the cellular handset, and a communications port for transferring the captured video frames or images to another device or computer. In each of the embodiments for 203 described above the storage element may be a memory that may be removable such as an SD card. Additionally, with reference to 203, the communications port described in the above embodiments may be wired such as a USB port or mini USB port or any other wired communication technology or it may be a wireless port such as Wi-Fi or Bluetooth or any other wireless communication technology. In some embodiments, image capture device 203 may be mounted on an aircraft, land or sea vehicle, helmet, clothing, linear mechanical stage, blimp, rotational stage, tripod, surveyor's tripod, rope, or any moving platform. 205 is a position determination device. In some embodiments 205 is a GPS receiver. In other embodiments 205 may be an IMU. In other embodiments 205 may be a laser radar or electromagnetic radar. In some embodiments 205 may be a local positioning system (LPS) receiver. In some embodiments the position determination device 205 and the image capture device 203 are combined into one device 201. 201 is indicated by a dashed line to indicate that in some embodiments the position determination device 205 and the image capture device 203 may be combined into one device 201. For example 201 may be a digital camera or cellular phone handset that has an on board GPS receiver or LPS receiver. In the embodiments where the image capture device 203 and the position determination device 205 are not combined, a communications capability 213 may exist between the two devices. This communications capability may exist to synchronize the position information with each image or video frame. Sometimes the image capture device 203 and the position determination device 205 do not have a communications capability and the image or video frames are synchronized with clock or timing information after the data has been stored in each device separately. In this case, where the image capture device 203 and the position determination device 205 do not have a communication capability with each other, each of these devices may be able to transfer its information to a computer 207 or similar device (e.g., another type of embedded system). 209 and 211 indicate that each device may be equipped with a communications port that may be used to transfer information. 209 and 211 may each be wired or wireless. In the embodiment where 209 and 211 are wired these connections do not have to be permanently connected to 207 but rather may be connected to 207 when information transfer is needed. Synchronization of the information between the image capture device 203 and the position determination device 205 may occur in computer 207. In some embodiments 207 is a general purpose computer equipped with a processor, memory, a display device, keyboard, input device, USB port, Ethernet card, optionally other wired or wireless communication ports, and instructions stored in memory and executed by the processor. In other embodiments 207 is an embedded system that includes a processor, memory, wired or wireless communications port, display and instructions stored in memory and executed by the processor. In some embodiments image capture device 203, position determination device 205, and computer 207 may all be combined into one device 215. In one embodiment, 215 may be a digital camera with an on board GPS receiver and a processor capable of performing the processes described in the present invention. In some embodiments computer 207 may be a single chip such as an ASIC, FPGA, computing device, semiconductor device or microprocessor embedded in a device that is capable of performing the processes described in the present invention. A semiconductor device may comprise a single monolithic silicon type integrated circuit.
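The after-the-fact synchronization described above, in which image frames and position fixes are stored separately and matched by clock information in computer 207, can be illustrated with a minimal sketch. The function names `nearest_fix` and `tag_images`, the use of plain numeric timestamps on a shared clock, and the nearest-in-time matching rule are all assumptions made for illustration; the specification does not prescribe a particular matching rule.

```python
from bisect import bisect_left

def nearest_fix(image_time, fix_times, fixes):
    """Return the position fix whose timestamp is closest to image_time.

    Hypothetical helper: fix_times must be sorted ascending, non-empty,
    and parallel to fixes (fixes[i] was recorded at fix_times[i]).
    """
    i = bisect_left(fix_times, image_time)
    if i == 0:
        return fixes[0]            # image taken before the first fix
    if i == len(fix_times):
        return fixes[-1]           # image taken after the last fix
    before, after = fix_times[i - 1], fix_times[i]
    # Pick whichever neighbouring fix is closer in time (ties go to the earlier fix).
    return fixes[i] if after - image_time < image_time - before else fixes[i - 1]

def tag_images(image_times, fix_times, fixes):
    """Associate each image timestamp with its nearest position fix."""
    return [nearest_fix(t, fix_times, fixes) for t in image_times]
```

A real system might instead interpolate between neighbouring fixes or reject matches that are too far apart in time; those refinements are outside this sketch.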

FIG. 3 depicts an example where an image or multiple images are captured at three different positions O₁, O₂, and O₃. In some embodiments, Camera 1, Camera 2, and Camera 3 refer to cameras at positions O₁, O₂, and O₃ respectively. In other embodiments Camera 1, Camera 2, and Camera 3 refer to the same camera that may have moved to each of the locations O₁, O₂, and O₃. For example an aircraft or moving platform may be equipped with a camera and arrive at positions O₁, O₂, and O₃ at different times. In one embodiment these three positions are related to the optical centers of each of the cameras. In one embodiment each of the images captured includes the point in common P whose projection is in the image. Point P is a point in 3D space. Point P may also correspond to a point in the images. In one embodiment the corresponding image projections of the point P are determined through registration as discussed below. In other embodiments, point P may be a single point in any of the point sets g₁, g₂, . . . , g_(n), not indicated in FIG. 3. The projected point P in common is determined by using registration. The point set g₁ is a point set such that each point in the group can be viewed by camera 1, and the point has corresponding matching points in some other cameras in the group g₁. The point set g₂ is a point set such that each point in the group can be viewed by camera 2 but not camera 1, and it has corresponding matching points in some other cameras of the group g₂, and so on for other point groups. The point O is considered the origin or the center of the earth in Earth-Center, Earth-Fixed (ECEF) coordinates, also known as the world coordinate system. The vector X is the position of point P in ECEF, GPS or WGS84 coordinates. The vector x₁ is the position of point P in the local coordinate system O₁. The vector x₂ is the position of point P in the local coordinate system O₂. The vector x₃ is the position of point P in the local coordinate system O₃. Local geometry refers to geometrical elements such as x₁, x₂, and x₃ described in local coordinate systems such as O₁, O₂, and O₃ respectively. T₁ is the vector translation from the point O to the point O₁ in the O coordinate system. T₂ is the vector translation from the point O to the point O₂ in the O coordinate system. T₃ is the vector translation from the point O to the point O₃ in the O coordinate system. Another way of describing T₁, T₂ and T₃ is simply that they are the ECEF coordinates of the three camera positions O₁, O₂, and O₃. t₁₂ is the vector translation in O₁ coordinates from the point O₁ to O₂. t₁₃ is the vector translation in O₁ coordinates from the point O₁ to O₃. The dotted lines at each of the points O, O₁, O₂, and O₃ depict local coordinate systems associated with each of these points. In some embodiments the x-y plane of the local coordinate systems for O, O₁, O₂, and O₃ may coincide with the image plane of the camera while the z axis is orthogonal to this plane. For example x₁ is the position of point P in the local coordinate system O₁. This may be different from x₃ because it is the position of the same point P but in a different coordinate system O₃. Each of these coordinate systems O, O₁, O₂, and O₃ may be rotated and translated into different orientations with respect to each other. In one aspect of the present invention we determine X from each of three images taken at known positions O₁, O₂, and O₃. Not indicated in FIG. 3 are the rotation matrices between each of the coordinate systems mentioned above with origins at O, O₁, O₂, and O₃. R₁ is the rotation from coordinate system O to coordinate system O₁. R₂ is the rotation from coordinate system O to coordinate system O₂. R₃ is the rotation from coordinate system O to coordinate system O₃. R₁₂ is the rotation from coordinate system O₁ to coordinate system O₂. R₁₃ is the rotation from coordinate system O₁ to coordinate system O₃. FIG. 3 satisfies the following geometric relationships:

x ₁ =R ₁(X−T ₁)  [1]

x ₂ =R ₂(X−T ₂)=R ₁₂(x ₁ +t ₁₂)  [2]

x ₃ =R ₃(X−T ₃)=R ₁₃(x ₁ +t ₁₃)  [3]

R ₂ =R ₁₂ R ₁  [4]

R ₃ =R ₁₃ R ₁  [5]

t ₁₂ =R ₁(T ₂ −T ₁)  [6]

t ₁₃ =R ₁(T ₃ −T ₁)  [7]

The vector x₁ is the description of point P in the local coordinate system O₁. Referring to FIG. 3, the vector O₁P is the difference between the vectors X and T₁ in the O coordinate system. In order to arrive at the vector x₁, the vector O₁P as described in coordinate system O needs to be rotated by the rotation R₁ in the O coordinate system, thus arriving at equation [1]. Equations [2] and [3] follow similar vector algebra deduction and will not be discussed further. Equation [4] describes the rotation R₂ as a composite rotation of R₁₂ and R₁. Similarly Equation [5] describes the rotation R₃ as a composite rotation of R₁₃ and R₁. t₁₂ is the vector translation in O₁ coordinates from the point O₁ to O₂. Referring to FIG. 3, the vector O₁O₂ is the difference between the vectors T₂ and T₁ in the O coordinate system. In order to arrive at the vector t₁₂, the vector O₁O₂ described in coordinate system O needs to be rotated by the rotation R₁ in the O coordinate system, thus arriving at equation [6]. Equation [7] follows similar vector algebra deduction and will not be discussed further.

According to equation [1], in order to determine the position of point P in the O coordinate system, we need to solve equation [1] for X:

X=R ₁ ⁻¹ x ₁ +T ₁  [8]

where R₁⁻¹ is the inverse matrix of the rotation matrix R₁.

X can be determined using equation [8] in which T₁ is a vector defined by the GPS coordinates of position O₁. In one embodiment, R₁ may be determined through t₁₂ and t₁₃. Additionally, in one embodiment x₁ may be determined through relative rotations R₁₂ and R₁₃ as well as translations t₁₂ and t₁₃, and eight matching points or triples in each of three images.
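Because T₁ in equation [8] is expressed in the Earth-centered O coordinate system, a receiver that reports geodetic latitude, longitude and ellipsoidal height would first need those values converted to ECEF. The sketch below uses the standard WGS84 ellipsoid constants and the closed-form geodetic-to-ECEF conversion; the function name `geodetic_to_ecef` and the assumption that the receiver outputs degrees and meters are illustrative and not taken from the specification.

```python
import math

WGS84_A = 6378137.0                     # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563           # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)    # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert WGS84 latitude/longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return (x, y, z)
```

Under these assumptions, T₁, T₂ and T₃ would be the outputs of this conversion for the three camera positions.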

The process described in FIG. 4 determines the GPS coordinates for a position P where there are at least three images that contain point P and seven other points in common captured from at least three separate positions. Firstly, the process determines the values for R₁₂, R₁₃, t₁₂, t₁₃ using the eight point algorithm. The eight points utilized in this algorithm may or may not include point P. However, the points must exist in each of the images taken at each of a minimum of three camera positions. Next the process determines R₁ through relative geometry of three camera positions and orientations as well as three absolute GPS positions of the cameras. Next the process determines the value for x₁ in accordance with the camera equation, which also utilizes geometrical quantities determined from the relative positions and absolute positions of O₁, O₂, and O₃ and eight matching triples. An iteration process is utilized to increase the accuracy of the value of X using a bundle adjustment technique. Finally, the process saves the GPS coordinates for the value X into a memory.

With reference to FIG. 4, in block 401 the process determines registration. In some embodiments, registration determines the corresponding pixels, matching pixels, matching sub-pixels or sub pixel areas, or corresponding locations in the image plane that may exist in multiple images. In some embodiments, registration is determined using correlation. In other embodiments, registration is determined using matching local neighborhoods. In some embodiments registration is determined by matching one pixel to multiple pixels. In yet other embodiments registration is determined by matching more than one pixel to other multiple pixels. Other methods of performing registration may be used. In other embodiments, registration is determined using the systems and processes described in U.S. Provisional Patent Application No. 61/230,679, titled “3D Matching”, which is hereby incorporated by reference. In block 403 the process receives GPS information of at least three camera locations (positions). In some embodiments receiving GPS information comprises receiving GPS coordinates that may be WGS84 or ECEF or derived from WGS84 or ECEF. In other embodiments, receiving GPS information comprises receiving GPS coordinates that may be coordinates that are derived from a relative position (location) of a receiver with respect to transmission towers in a telecommunications system or in a radio frequency based system. In some embodiments, receiving GPS information comprises transferring GPS coordinate information for each of the images captured from a GPS receiver or a camera with built in GPS receiver to a memory. The memory may be located in a general purpose computer or an embedded system. In other embodiments, receiving GPS information comprises transferring GPS coordinate information for each of the images captured from memory to a processor for processing. In other embodiments, receiving GPS information comprises a person entering GPS coordinates by hand into a general purpose computer or embedded system. In block 405 the process determines initial estimates of the geometry between camera positions and orientations. In one embodiment with three camera positions as in FIG. 3, the process determines initial estimates of the geometry between the camera positions and orientations. The geometry between the camera positions and orientations in this embodiment comprises R₁₂, R₁₃, t₁₂, and t₁₃. The process determines the initial estimates of R₁₂, R₁₃, t₁₂, and t₁₃ using the eight point algorithm. The points used in the eight point algorithm may be eight triples that are determined from the registration process in 401. In another embodiment, the eight point algorithm may use eight pairs in each pair of images that are determined from the registration process in 401.

In one embodiment, a processor is configured by program instructions to perform the eight point algorithm in order to determine the initial values for R₁₂, R₁₃, t₁₂, and t₁₃. The eight point algorithm is given in “An Invitation to 3-D Vision” by Yi Ma et al (ISBN 0-387-00893-4) Springer-Verlag New York, Inc. 2004 which is hereby incorporated by reference.

In block 407 the process determines the initial estimates of point(s) positions in local camera coordinate(s) systems. In one embodiment a processor executes instructions to determine x₁ using the camera equation as follows:

x ₁=λ₁ ^(P) ·u ₁ ^(P)  [9]

The camera equation is given in “An Invitation to 3-D Vision” by Yi Ma et al (ISBN 0-387-00893-4) Springer-Verlag New York, Inc. 2004 which is hereby incorporated by reference. u₁ ^(P), which denotes the point P in camera 1 in its image plane coordinates, is obtained by first obtaining point triple

${U_{1}^{P\;} = \begin{pmatrix}x \\y \\1\end{pmatrix}_{1}^{P}},{U_{2}^{P} = \begin{pmatrix}x \\y \\1\end{pmatrix}_{2}^{P}},{U_{3}^{P} = \begin{pmatrix}x \\y \\1\end{pmatrix}_{3}^{P}},{{where}\mspace{14mu} \begin{pmatrix}x \\y \\1\end{pmatrix}_{c}^{P}}$

are homogeneous image pixel coordinates in pixel units of the point P in camera c. Camera calibration matrix K converts pixel coordinates to image plane coordinates through u_(c) ^(P)=K⁻¹U_(c) ^(P), where c is the camera 1, 2, or 3. u_(c) ^(P) denotes the point P in camera c in its image coordinates. The definition of K and obtaining K is given in “An Invitation to 3-D Vision” by Yi Ma et al (ISBN 0-387-00893-4) Springer-Verlag New York, Inc. 2004 which is hereby incorporated by reference. Similarly, the camera equation for point P in the other camera coordinates is as follows:

x ₂=λ₂ ^(P) u ₂ ^(P) =R ₁₂(x ₁ +t ₁₂)=λ₁ ^(P) R ₁₂ u ₁ ^(P) +R ₁₂ t₁₂  [10]

x ₃=λ₃ ^(P) u ₃ ^(P) =R ₁₃(x ₁ +t ₁₃)=λ₁ ^(P) R ₁₃ u ₁ ^(P) +R ₁₃ t₁₃  [11]

and more generally,

λ_(c) ^(P) u _(c) ^(P) =R _(1c)(x ₁ +t _(1c))=λ₁ ^(P) R _(1c) u ₁ ^(P)+R _(1c) t _(1c)  [12]

Taking the cross product of each of camera equations [10] and [11] with the corresponding vector representing point P in its image plane coordinates (e.g. u₂ ^(P) and u₃ ^(P) respectively) provides the following:

λ₁ ^(P) u ₂ ^(P) ×R ₁₂ u ₁ ^(P) +u ₂ ^(P) ×R ₁₂ t ₁₂=0  [13]

λ₁ ^(P) u ₃ ^(P) ×R ₁₃ u ₁ ^(P) +u ₃ ^(P) ×R ₁₃ t ₁₃=0  [14]
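The step from [10] to [13] can be spelled out as a short worked derivation. It uses only quantities already defined and the fact that the cross product of a vector with itself is zero. Taking the cross product of u₂ ^(P) with both sides of [10] gives:

$u_{2}^{P} \times \left(\lambda_{2}^{P} u_{2}^{P}\right) = \lambda_{1}^{P}\, u_{2}^{P} \times R_{12} u_{1}^{P} + u_{2}^{P} \times R_{12} t_{12}$

The left side vanishes, so the unknown depth λ₂ ^(P) drops out and [13] remains with λ₁ ^(P) as the only unknown. Equation [14] follows from [11] in the same way.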

A processor is configured by program instructions to determine λ₁ ^(P) by the point depth equation as follows:

$\frac{1}{\lambda_{1}^{P}} = -\frac{\sum\limits_{c=2}^{3} \left(u_{c}^{P} \times R_{1c} t_{1c}\right)^{t} \left(u_{c}^{P} \times R_{1c} u_{1}^{P}\right)}{\sum\limits_{c=2}^{3} \left\| u_{c}^{P} \times R_{1c} t_{1c} \right\|^{2}} \qquad [15]$

Equation [15] computes λ₁ ^(P) where the point P is seen in each of the three images taken from positions O₁, O₂, and O₃ as seen in FIG. 3. In other embodiments, λ₁ ^(P) may be determined for each point P that is an element of any of the point sets g₁, g₂, . . . , g_(n). In this case the point depth equation more generally is described in [16] below:

$\frac{1}{\lambda_{1}^{P}} = -\frac{\sum\limits_{c \in g_{1}} \left(u_{c}^{P} \times R_{1c} t_{1c}\right)^{t} \left(u_{c}^{P} \times R_{1c} u_{1}^{P}\right)}{\sum\limits_{c \in g_{1}} \left\| u_{c}^{P} \times R_{1c} t_{1c} \right\|^{2}} \qquad [16]$
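The point depth equation can be read as the least-squares solution for 1/λ₁ ^(P) of the stacked constraints [13] and [14]. The following is a minimal sketch of that computation, assuming image plane points are given as 3-tuples with a third component of 1, rotations as 3x3 nested lists, and the relative geometry in the convention of equation [12]. The function name `point_depth` and the `views` argument layout are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return tuple(dot(row, v) for row in R)

def point_depth(u1, views):
    """Depth lambda_1 of a point along ray u1 in camera 1, per equations [15]/[16].

    views is a list of (u_c, R_1c, t_1c) for every other camera c that sees
    the point. Assumes at least one view has a non-degenerate baseline,
    otherwise the denominator is zero.
    """
    num = 0.0
    den = 0.0
    for u_c, R, t in views:
        b = cross(u_c, matvec(R, t))     # u_c x R_1c t_1c
        a = cross(u_c, matvec(R, u1))    # u_c x R_1c u_1
        num += dot(b, a)
        den += dot(b, b)
    inverse_depth = -num / den
    return 1.0 / inverse_depth
```

With the depth in hand, x₁ follows from the camera equation [9] as λ₁ ^(P) times u₁ ^(P).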

In block 409 the process optimizes the camera(s) geometry and the point(s) position. In one embodiment the optimization may be minimizing a re-projection error for the solution of R₁₂, R₁₃, t₁₂, and t₁₃. A re-projection error at the first iteration is given by a function of differences of the positions of the point P in the image coordinate system of camera 1 when re-projected into the image plane of camera 2 and camera 3 with the prior estimates of R₁₂, R₁₃, t₁₂, and t₁₃. In one embodiment the error function for a single point P may be expressed as:

E=(re ₂(u ₁ ^(P))−u ₂ ^(P))²+(re ₃(u ₁ ^(P))−u ₃ ^(P))²  [17]

where re means re-projection from camera image 1 to camera image 2 or from camera image 1 to camera image 3. These re-projections are determined using the prior estimates of R₁₂, R₁₃, t₁₂, and t₁₃. A general bundle adjustment may be used to achieve each camera's rotation and translation vector relative to camera 1 by minimizing the re-projection error:

$\begin{matrix}\begin{matrix}{E = {\sum\left( {{\overset{\sim}{u}}_{c}^{p} - u_{c}^{p}} \right)^{2}}} \\{= {{\sum\limits_{g_{1}}\left( {{\overset{\sim}{u}}_{c}^{p} - u_{c}^{p}} \right)^{2}} + {\sum\limits_{g_{2}}\left( {{\overset{\sim}{u}}_{c}^{p} - u_{c}^{p}} \right)^{2}} + \ldots + {\sum\limits_{g_{n}}\left( {{\overset{\sim}{u}}_{c}^{p} - u_{c}^{p}} \right)^{2}}}}\end{matrix} & \lbrack 18\rbrack\end{matrix}$

where ũ_(c) ^(P)=re_(c)(u₁ ^(P)) and c refers to camera 1, 2, 3, . . . and where g₁, g₂, . . . , g_(n) are point groups or sets of points, with g₁ being a point set such that each point in the group can be viewed by camera 1, and the point has corresponding matching points in some other cameras in the group g₁. g₂ is a point set such that each point in the group can be viewed by camera 2 but not camera 1, and it has corresponding matching points in some other cameras of the group g₂, and so on for other point groups. The index p in [18] ranges through all of the points that are matched in corresponding groups g₁, g₂, . . . , g_(n). The bundle adjustment described in equation [18] is iterated until the error is reduced to a predetermined threshold or another stopping criterion is met. Once the bundle adjustment is complete R₁₂, R₁₃, t₁₂, and t₁₃ are known.
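The single-point error of equation [17] can be sketched as follows. The specification does not spell out the re-projection function re, so this sketch assumes it maps the camera 1 ray through equation [12] using a depth estimate (for example from [15]) and then divides by the third component to return to image plane coordinates. The names `reproject` and `reprojection_error` are hypothetical, and the optimizer that would minimize this quantity is not shown.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return tuple(dot(row, v) for row in R)

def reproject(u1, depth, R, t):
    """Assumed form of re_c: apply x_c = R_1c (depth * u1 + t_1c), then divide by z."""
    x1 = tuple(depth * v for v in u1)
    xc = matvec(R, tuple(a + b for a, b in zip(x1, t)))
    return (xc[0] / xc[2], xc[1] / xc[2], 1.0)

def reprojection_error(u1, depth, views):
    """Sum of squared differences between re-projected and observed points, as in [17].

    views is a list of (observed u_c, current estimate of R_1c, current estimate of t_1c).
    """
    err = 0.0
    for u_c, R, t in views:
        r = reproject(u1, depth, R, t)
        err += sum((a - b) ** 2 for a, b in zip(r, u_c))
    return err
```

Summing this quantity over all matched points in all groups gives the total error of equation [18], which a bundle adjustment would reduce by updating the rotation and translation estimates.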

In block 411 the process determines the rotation between the camera(s) coordinate system(s) and the world coordinate system. In one embodiment determining the rotation between the camera(s) coordinate system(s) and the world coordinate system is determining R₁. A processor is configured by program instructions to determine R₁ by first determining an axis of rotation of R₁ and then determining the amount of rotation around the axis. A vector v may be used to express the direction of the rotational axis as well as the magnitude of the rotation: the rotational axis of the rotation R₁ is the direction of this vector, and the amount of rotation R₁ is the magnitude of this vector. We can find this vector v and then use it to determine R₁. According to this embodiment we define the following vectors:

v ₁=(T ₁₂ ×t ₁₂)×(T ₁₂ +t ₁₂)  [19]

v ₂=(T ₁₃ ×t ₁₃)×(T ₁₃ +t ₁₃)  [20]

where T₁₂=T₂−T₁ and T₁₃=T₃−T₁. Equation [19] defines a vector v₁ which also represents the plane that is normal to v₁. Any vector in this plane can be a rotation axis that rotates T₁₂ to t₁₂. Similarly, equation [20] defines a vector v₂ which also represents the plane that is normal to v₂, and any vector in that plane can be a rotation axis that rotates T₁₃ to t₁₃. The cross product of these two vectors, which lies in both planes, is the vector v.

v=v ₁ ×v ₂  [21]

v is the rotation axis of our rotation matrix R₁. Next we need to determine the amount of rotation or the length of the vector v. The length of vector v is determined by the following equation:

$\begin{matrix}{{{\cos (\Theta)} = \frac{s_{1} \cdot s_{2}}{{s_{1}} \cdot {s_{2}}}}{{{where}\mspace{14mu} s_{1}} = {{T_{13} - {\left( \frac{T_{13} \cdot v}{v} \right)\frac{v}{v}\mspace{14mu} {and}\mspace{14mu} s_{2}}} = {t_{13} - {\left( \frac{t_{13} \cdot v}{v} \right){\frac{v}{v}.}}}}}} & \lbrack 22\rbrack\end{matrix}$

The vectors s₁ and s₂ form a plane that is normal to the vector v. Therefore, v is the rotation axis of s₁ to s₂ where the amount of rotation is represented in equation [22]. The amount of the rotation in magnitude is the value of Θ in equation [22]. The same angle Θ also represents a rotation between T₁₃ and t₁₃. We can define the final rotation vector that is equivalent to

$\begin{matrix}{\omega = \frac{v}{\|v\|} \cdot \Theta} & \lbrack 23\rbrack\end{matrix}$

where $\frac{v}{\|v\|}$ is the direction of the vector and Θ is the magnitude of the rotation. We can also express the vector ω as

$\omega = \begin{pmatrix}v_{x} \\v_{y} \\v_{z}\end{pmatrix}$

and define the matrix

$\varpi = {\begin{bmatrix}0 & {- v_{z}} & v_{y} \\v_{z} & 0 & {- v_{x}} \\{- v_{y}} & v_{x} & 0\end{bmatrix}.}$

Next we can determine the matrix R₁ by using the Rodrigues formula for a rotation matrix given in “An Invitation to 3-D Vision” by Yi Ma et al (ISBN 0-387-00893-4) Springer-Verlag New York, Inc. 2004 which is hereby incorporated by reference and also expressed in [24] below:

$\begin{matrix}{R_{1\;} = {I + {\frac{\varpi}{\omega } \cdot {\sin \left( {\omega } \right)}} + {\frac{\varpi^{2}}{{\omega }^{2}} \cdot \left( {1 - {\cos \left( {\omega } \right)}} \right)}}} & \lbrack 24\rbrack\end{matrix}$

where I is the identity matrix.
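By way of illustration only, the computation of equations [19] through [24] may be sketched in Python with NumPy as follows. The function names `skew`, `rodrigues` and `rotation_from_baselines` are hypothetical and do not appear in this disclosure. The sketch makes two assumptions that are not stated above: the input vectors are first scaled to unit length, since t₁₂ and t₁₃ may be known only up to scale, and the sign of Θ is taken from the orientation of s₁ and s₂ about v, since equation [22] by itself yields only an unsigned angle.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix built from a 3-vector (the matrix following equation [23])."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rodrigues(omega):
    """Rotation matrix from a rotation vector, equation [24]."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)  # no measurable rotation
    K = skew(omega)
    return (np.eye(3)
            + K / theta * np.sin(theta)
            + K @ K / theta**2 * (1.0 - np.cos(theta)))

def rotation_from_baselines(T12, t12, T13, t13):
    """Estimate R1 such that R1 @ T12 is parallel to t12 and R1 @ T13 to t13."""
    # Assumption: normalize so that only directions matter.
    T12, t12, T13, t13 = (np.asarray(a, dtype=float) / np.linalg.norm(a)
                          for a in (T12, t12, T13, t13))
    v1 = np.cross(np.cross(T12, t12), T12 + t12)   # equation [19]
    v2 = np.cross(np.cross(T13, t13), T13 + t13)   # equation [20]
    v = np.cross(v1, v2)                           # equation [21]
    v_hat = v / np.linalg.norm(v)
    # Equation [22]: project T13 and t13 onto the plane normal to v.
    s1 = T13 - np.dot(T13, v_hat) * v_hat
    s2 = t13 - np.dot(t13, v_hat) * v_hat
    cos_theta = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    # Assumption: resolve the sign of the angle, which the cosine cannot give.
    if np.dot(np.cross(s1, s2), v_hat) < 0.0:
        theta = -theta
    return rodrigues(v_hat * theta)                # equations [23] and [24]
```

The sketch does not handle degenerate geometry, for example when T₁₂ is already parallel to t₁₂, in which case v₁ vanishes and the axis cannot be found this way.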

In block 413 the process determines the GPS coordinates for all points for which correspondences were established through the registration block above. In one embodiment, a processor is instructed to perform the computation defined in equation [8] for each of the image pixels or sub-pixels using the final values determined from the previous process blocks in FIG. 4.

In block 415 the process saves the GPS coordinates of the point(s) positions. In one embodiment the process saves the GPS coordinates to a memory. In another embodiment the process may not necessarily save the GPS coordinates but rather the process simply transmits the GPS coordinates. In another embodiment the process sends the GPS coordinates to a buffer or cache or another computing device.

FIG. 5 depicts the same three positions O₁, O₂, and O₃ as in FIG. 3. In one embodiment images are captured at each of these positions. The labels having the same name in both FIG. 3 and FIG. 5 represent the same elements. In FIG. 5 we depict the positions O₁, O₂, and O₃ as centers of spheres S₁, S₂, and S₃ respectively where S₁ is a sphere with its center at O₁, S₂ is a sphere with its center at O₂, and S₃ is a sphere with its center at O₃. Again as with respect to FIG. 3, Camera 1, Camera 2, and Camera 3 refer to cameras at positions O₁, O₂, and O₃ respectively in FIG. 5. In one embodiment these three positions are related to the optical centers of each of the cameras. In one embodiment each of the images captured includes the point in common P whose projection is in the image. The vector X is the position of point P in ECEF, GPS or WGS84 coordinates. The vector X is not shown in FIG. 5 but it is shown in FIG. 3. Point P is a point in 3D space that represents the intersection of the three spheres S₁, S₂, and S₃. Point P may also correspond to a point in the images. In one embodiment the corresponding image projections of the point P are determined through registration as discussed above. In other embodiments, point P may be a single point in any of the sets of points g₁, g₂, . . . , g_(n), not indicated. As similarly discussed above, the point set g₁ is a point set such that each point in the group can be viewed by camera 1, and the point has corresponding matching points in some other cameras in the group g₁. The point set g₂ is a point set such that each point in the group can be viewed by camera 2 but not camera 1, and it has corresponding matching points in some other cameras of the group g₂, and so on for other point groups. The point O shown in FIG. 3 is not depicted in FIG. 5.
The vector x₁ is the position of point P in the local coordinate system O₁. Also the length of x₁ is the radius of the sphere S₁. The vector x₂ is the position of point P in the local coordinate system O₂. Also the length of x₂ is the radius of the sphere S₂. The vector x₃ is the position of point P in the local coordinate system O₃. Also the length of x₃ is the radius of the sphere S₃. The circle C₁ depicts the intersection of spheres S₂ and S₃. The circle C₂ depicts the intersection of spheres S₁ and S₃. The intersection of circles C₁ and C₂ depicts the point P. Thus, it follows from FIG. 5 that the point P can be determined as a point that belongs to the intersection of the spheres S₁, S₂, and S₃. If the intersection of the spheres S₁, S₂, and S₃ is determined in the world coordinate system, then point P is determined in the world coordinate system. Trilateration will be used in order to determine the world coordinates of the point P. Trilateration is the process by which the intersection point is determined between three spheres given the coordinates of the centers of the spheres and the radii of each of the spheres.
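By way of illustration only, one common closed-form way to intersect three spheres may be sketched in Python as follows. This is a textbook construction and is not asserted to be the trilateration procedure of any particular embodiment. The function name `trilaterate` and its argument names are hypothetical; the centers O₁, O₂, O₃ are assumed to be given in a Cartesian world frame such as ECEF, in the same length units as the radii.

```python
import numpy as np

def trilaterate(O1, O2, O3, r1, r2, r3, tol=1e-9):
    """Return the two candidate intersection points of three spheres.

    O1, O2, O3 are the sphere centers and r1, r2, r3 the radii
    (the lengths of x1, x2 and x3 in the text).
    """
    O1, O2, O3 = (np.asarray(p, dtype=float) for p in (O1, O2, O3))
    # Build an orthonormal frame with O1 at the origin and O2 on the x axis.
    d = np.linalg.norm(O2 - O1)
    ex = (O2 - O1) / d
    i = np.dot(ex, O3 - O1)
    ey_raw = (O3 - O1) - i * ex
    j = np.linalg.norm(ey_raw)
    if j < tol:
        raise ValueError("sphere centers are collinear")
    ey = ey_raw / j
    ez = np.cross(ex, ey)
    # Solve the three sphere equations in that frame.
    x = (r1**2 - r2**2 + d**2) / (2.0 * d)
    y = (r1**2 - r3**2 + i**2 + j**2) / (2.0 * j) - (i / j) * x
    z_sq = r1**2 - x**2 - y**2
    if z_sq < -tol:
        raise ValueError("spheres do not intersect")
    z = np.sqrt(max(z_sq, 0.0))
    base = O1 + x * ex + y * ey
    # The two solutions are mirror images across the plane of the centers.
    return base + z * ez, base - z * ez
```

For example, with centers (0, 0, 0), (4, 0, 0) and (0, 3, 0) and radii measured to the point (1, 2, 2), the two candidates returned are (1, 2, 2) and its mirror image (1, 2, −2).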

The process described in FIG. 6 determines the GPS coordinates for a position P where there are at least three images that contain point P and seven other points in common captured from at least three separate positions. Firstly, the process determines the values for R₁₂, R₁₃, t₁₂, t₁₃ using the eight-point algorithm. The eight points utilized in this algorithm may or may not include point P. However, the points must exist in each of the images taken at each of a minimum of three camera positions. Next the process determines the values for x₁, x₂, and x₃ in accordance with the camera equations described in equations [9], [10], and [11] which also utilize geometrical quantities determined from the relative positions and absolute positions of O₁, O₂, and O₃ and eight matching triples. An iteration process is utilized to increase the accuracy of the values R₁₂, R₁₃, t₁₂, t₁₃ using a bundle adjustment technique. Next the process of trilateration is performed using the world coordinates for O₁, O₂, O₃ and the lengths of x₁, x₂, and x₃. Finally, the process saves the GPS coordinates for the value X into a memory.
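By way of illustration only, the linear step of an eight-point algorithm for one camera pair may be sketched in Python as follows. The function name `essential_from_eight_points` is hypothetical. The sketch assumes calibrated image coordinates, that is, homogeneous coordinates with the camera intrinsics already removed, and it stops at the essential matrix E; the decomposition of E into a rotation such as R₁₂ and a translation direction such as t₁₂, and any conditioning of noisy coordinates, are not shown.

```python
import numpy as np

def essential_from_eight_points(x1, x2):
    """Estimate E with x2^T E x1 = 0 from N >= 8 matched points.

    x1, x2 are (N, 3) arrays of calibrated homogeneous image coordinates
    of the same scene points in the first and second camera.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.shape[0] < 8 or x1.shape != x2.shape:
        raise ValueError("need at least eight matched points")
    # Each match gives one linear equation in the nine entries of E
    # (row-major): sum over i, j of x2[i] * E[i, j] * x1[j] = 0.
    A = np.einsum("ni,nj->nij", x2, x1).reshape(-1, 9)
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)          # null vector of A
    # Enforce the essential-matrix structure: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

The same routine could be applied to the pair Camera 1 and Camera 3 to obtain the second relative geometry.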

With reference to FIG. 6, in block 601 the process determines registration. Block 601 is the same as block 401 and will not be discussed here further. In block 603 the process receives the GPS coordinates of the three camera locations. Block 603 is the same as block 403 and will not be discussed here further. In block 605 the process determines initial estimates of the geometry between camera positions and orientations. Block 605 is the same as block 405 and will not be discussed here further. In block 607 the process determines the initial estimates of point(s) positions in local camera coordinate(s) systems. Block 607 is the same as block 407 and will not be discussed here further. In block 609 the process optimizes the camera(s) geometry and the point(s) position. Block 609 is the same as block 409 and will not be discussed here further. In block 611 the process uses trilateration to determine the world coordinates of the point(s) positions. In one embodiment a processor is configured by program instructions to determine the position of P (e.g. vector X) by the intersection point of three spheres S₁, S₂, and S₃. The processor may determine the position of P by utilizing as input the lengths of the vectors x₁, x₂, x₃ and the world coordinates of O₁, O₂, O₃. The lengths of vectors x₁, x₂, and x₃ may be determined using equations [9], [10], and [11]. In some embodiments the process may yield more than one solution. In this case the ambiguity may be resolved by utilizing a fourth camera, Camera 4, located at position O₄ not included in FIG. 3 or 5. In this embodiment the trilateration process will solve for the intersection of four spheres. In block 613 the process saves the GPS coordinates of the point(s) positions. In one embodiment the process saves the GPS coordinates to a memory.
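By way of illustration only, one simple way to use a fourth sphere to resolve the ambiguity may be sketched in Python as follows. The function name `pick_with_fourth_camera` is hypothetical, and the selection rule, namely keeping the candidate whose distance to O₄ best matches the fourth radius, is an assumption made for this sketch rather than a full four-sphere solution.

```python
import numpy as np

def pick_with_fourth_camera(candidates, O4, r4):
    """Choose the candidate point most consistent with a fourth sphere.

    candidates: iterable of 3-D points, e.g. the two three-sphere solutions.
    O4: world coordinates of the fourth camera position.
    r4: length of the vector from O4 to point P.
    """
    O4 = np.asarray(O4, dtype=float)
    pts = [np.asarray(c, dtype=float) for c in candidates]
    # Residual: how far each candidate lies from the surface of sphere S4.
    residuals = [abs(np.linalg.norm(p - O4) - r4) for p in pts]
    return pts[int(np.argmin(residuals))]
```

The fourth position should not lie in the plane of O₁, O₂, and O₃, since both candidates are then equally distant from it.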

With reference to FIG. 7, in block 701 the process determines registration. Block 701 is the same as block 401 and will not be discussed here further. In block 703 the process receives GPS coordinates of at least two camera locations and IMU information from at least one of those camera locations. In some embodiments GPS coordinates may be WGS84 or ECEF or derived from WGS84 or ECEF. In other embodiments GPS coordinates may be coordinates that are derived from a relative position of a receiver with respect to transmission towers in a telecommunications system or in a radio frequency based system. In some embodiments, receiving GPS coordinates and IMU information comprises transferring GPS coordinate and IMU information for each of the images captured from a combined GPS receiver and IMU (GPS/IMU) device or a camera with built in GPS and IMU receiver to a memory or a buffer. The memory may be located in a general purpose computer or an embedded system. In some embodiments the GPS/IMU device acquires the GPS information and the IMU information non-synchronously. In this case the GPS and IMU information may be interpolated to achieve approximate synchronization. In other embodiments, receiving GPS coordinates or IMU information comprises transferring GPS coordinate and IMU information for each of the images captured from memory to a processor for processing. In other embodiments, receiving GPS coordinates and IMU information comprises a person entering GPS coordinates and IMU information by hand into a general purpose computer or embedded system. In some embodiments IMU information comprises roll, pitch and yaw information. In some embodiments a GPS device and an IMU device are combined into a single device and referred to in the instant invention as GPS/IMU device. However, in other embodiments a GPS device and an IMU device are separate devices.
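By way of illustration only, the interpolation of non-synchronous GPS and IMU samples to the image capture times may be sketched in Python as follows. The function name `interpolate_to_image_times` and all argument names are hypothetical. The sketch assumes linear interpolation, positions expressed in a Cartesian frame such as ECEF, angles in radians, and timestamps on a common clock sorted in increasing order; none of these choices is specified above.

```python
import numpy as np

def interpolate_to_image_times(image_t, gps_t, gps_xyz, imu_t, imu_rpy):
    """Linearly interpolate GPS positions and IMU angles to image timestamps.

    gps_xyz: (N, 3) positions sampled at times gps_t.
    imu_rpy: (M, 3) roll, pitch, yaw in radians sampled at times imu_t.
    Returns (positions, angles), each of shape (len(image_t), 3).
    """
    image_t = np.asarray(image_t, dtype=float)
    gps_xyz = np.asarray(gps_xyz, dtype=float)
    imu_rpy = np.asarray(imu_rpy, dtype=float)
    pos = np.column_stack(
        [np.interp(image_t, gps_t, gps_xyz[:, k]) for k in range(3)])
    # Remove 2*pi jumps first so that angles interpolate the short way round.
    unwrapped = np.unwrap(imu_rpy, axis=0)
    rpy = np.column_stack(
        [np.interp(image_t, imu_t, unwrapped[:, k]) for k in range(3)])
    # Wrap the result back into [-pi, pi).
    rpy = (rpy + np.pi) % (2.0 * np.pi) - np.pi
    return pos, rpy
```

Linear interpolation is only an approximation; it is reasonable when the platform moves smoothly between samples.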

In block 705 the process determines the boresight rotation and translation. In some embodiments the boresight rotation may be the relative rotation between the GPS/IMU device coordinate system and the camera coordinate system. Also in some embodiments, the boresight translation may be the translation between the GPS/IMU device coordinate system and the camera coordinate system. Equation [25] below presents the relationship between the coordinates of point P in the GPS/IMU device coordinates and coordinates of the point P in the local camera 1 coordinates O₁, using the elements from FIG. 3:

X _(gps/IMU) =R _(b) x ₁ +T _(b)  [25]

where R_(b) is the boresight rotational matrix, T_(b) is the boresight translational vector, x₁ is a vector of the position of point P in the local coordinate system O₁, and X_(gps/IMU) is the vector of the position of point P in the GPS/IMU coordinate system.

In block 707 the process determines initial estimates of the geometry between camera positions and orientations. Block 707 is the same as block 405 and will not be discussed here further. In block 709 the process determines the initial estimates of point(s) positions in local camera coordinate(s) systems. Block 709 is the same as block 407; however, since this embodiment at a minimum requires only two cameras, for example Camera 1 and Camera 2, equation [15], which determines λ₁^(P), is determined for c=2 (two cameras) only. Also x₁ is determined by using equation [9]. In block 711 the process optimizes the camera(s) geometry and the point(s) position. Block 711 is the same as block 409 and will not be discussed here further. In block 713 the process determines the local north, east, down (NED) coordinates. In one embodiment a processor is configured by program instructions to perform equation [26] below where the NED coordinates are determined.

X _(NED) =R _(y) R _(p) R _(r) X _(gps/IMU)  [26]

where X_(NED) is the vector of the position of point P in the local NED coordinate system, R_(y) is a yaw rotation matrix determined from the GPS/IMU device, R_(p) is a pitch rotation matrix determined from the GPS/IMU device, R_(r) is a roll rotation matrix determined from the GPS/IMU device, and X_(gps/IMU) is the vector of the position of point P in the GPS/IMU coordinate system. In block 715 the process determines the GPS coordinates for all point(s) positions for which correspondences were established through the registration block above in 701. In one embodiment a processor is configured by program instructions to determine GPS coordinates using equation [27] below:

X _(ECEF) =R _(NED) X _(NED) +T _(ECEF)  [27]

where X_(ECEF) is the vector of the position of point P in the ECEF world coordinate system, R_(NED) is the rotation matrix for rotating from the NED coordinate system to the ECEF coordinate system, and T_(ECEF) is the translation vector for translating from the NED coordinate system to the ECEF coordinate system. R_(NED) and T_(ECEF) are determined from the GPS data provided by the GPS/IMU device. In block 717 the process saves the GPS coordinates of the point(s) positions. In one embodiment the process saves the GPS coordinates to a memory. In another embodiment the process may not necessarily save the GPS coordinates but rather the process simply transmits the GPS coordinates. In another embodiment the process sends the GPS coordinates to a buffer or cache or another computing device.
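By way of illustration only, the chain of equations [25], [26], and [27] may be sketched in Python as follows. All function names are hypothetical. The sketch assumes the common aerospace convention in which yaw, pitch, and roll are rotations about the z, y, and x axes respectively, angles in radians, the GPS/IMU position given as WGS84 latitude, longitude, and ellipsoidal height, R_(NED) built from the north, east, and down unit vectors at that position, and T_(ECEF) taken as the ECEF position of the GPS/IMU device. These conventions are not stated above and a particular device may differ.

```python
import numpy as np

WGS84_A = 6378137.0                      # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)     # first eccentricity squared

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def geodetic_to_ecef(lat, lon, h):
    """WGS84 latitude, longitude (radians) and height (m) to ECEF (m)."""
    n = WGS84_A / np.sqrt(1.0 - WGS84_E2 * np.sin(lat) ** 2)
    return np.array([(n + h) * np.cos(lat) * np.cos(lon),
                     (n + h) * np.cos(lat) * np.sin(lon),
                     (n * (1.0 - WGS84_E2) + h) * np.sin(lat)])

def ned_to_ecef_rotation(lat, lon):
    """R_NED: columns are the north, east and down unit vectors in ECEF."""
    north = [-np.sin(lat) * np.cos(lon), -np.sin(lat) * np.sin(lon), np.cos(lat)]
    east = [-np.sin(lon), np.cos(lon), 0.0]
    down = [-np.cos(lat) * np.cos(lon), -np.cos(lat) * np.sin(lon), -np.sin(lat)]
    return np.column_stack([north, east, down])

def camera_point_to_ecef(x1, R_b, T_b, roll, pitch, yaw, lat, lon, h):
    """Map a point from local camera 1 coordinates to ECEF coordinates."""
    X_gps_imu = R_b @ x1 + T_b                                     # [25]
    X_ned = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll) @ X_gps_imu    # [26]
    R_ned = ned_to_ecef_rotation(lat, lon)
    T_ecef = geodetic_to_ecef(lat, lon, h)
    return R_ned @ X_ned + T_ecef                                  # [27]
```

For example, with an identity boresight, zero roll, pitch, and yaw, and the device at latitude 0, longitude 0, and height 0, a point 10 m along the device z axis lies 10 m below the device, at ECEF coordinates (6378127, 0, 0).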

Sometimes a GPS/IMU device or standalone GPS device is not available. In this instance we utilize ground truth points. Ground truth points are observed points in an image that have known GPS coordinates. When using the process depicted in FIG. 7, the IMU information is derived from the ground truth points and their related geometry as is known in the art. When using the process depicted in FIG. 4, the GPS information is derived from the ground truth points as is known in the art.

There are situations where an observation or surveillance camera is used to monitor an area. However, the GPS coordinates of the observation/surveillance camera are not known. Yet we still want the GPS coordinates of the pixels and objects being observed in the camera. In order to obtain the GPS coordinates of various objects being viewed by the observation/surveillance camera we first determine GPS coordinates and the 3-D model of the 3-D scene that is being observed by the surveillance camera, using another camera(s) whose GPS location is known. By methods known in the art, calculate the projection map from the observation/surveillance camera to the 3-D model determined as above. Calculate the intersection of the projection ray through each pixel of the observation/surveillance camera with the 3-D model. Assign the GPS coordinates of the intersection point to the corresponding pixel of the observation/surveillance camera. In this method, it is assumed that the pixel observations correspond to objects located against the GPS referenced 3-D model. The applications of this process may include finding GPS coordinates of subjects (people) and objects (vehicles), including in real-time observations from border security cameras, football stadium security cameras and public places security cameras.
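By way of illustration only, the intersection of a pixel's projection ray with a 3-D model may be sketched in Python as follows. The sketch assumes that the 3-D model is a list of triangles whose vertices are already expressed in a Cartesian world frame such as ECEF, and that the projection map of the observation/surveillance camera is a pinhole model with intrinsic matrix K, camera-to-world rotation R_wc, and camera center C in that same frame. All names are hypothetical, and the ray and triangle test used is the widely known Möller-Trumbore method, chosen here for brevity.

```python
import numpy as np

def pixel_ray(K, R_wc, C, u, v):
    """World-frame origin and unit direction of the ray through pixel (u, v)."""
    d_cam = np.linalg.solve(K, np.array([u, v, 1.0]))
    d = R_wc @ d_cam
    return np.asarray(C, dtype=float), d / np.linalg.norm(d)

def ray_triangle(origin, direction, a, b, c, eps=1e-9):
    """Distance along the ray to triangle (a, b, c), or None if it is missed."""
    e1, e2 = b - a, c - a
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:
        return None                      # ray is parallel to the triangle
    s = origin - a
    u = np.dot(s, p) / det
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) / det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) / det
    return t if t > eps else None        # keep only hits in front of the camera

def locate_pixel(K, R_wc, C, u, v, triangles):
    """World coordinates of the nearest model point seen at pixel (u, v)."""
    origin, direction = pixel_ray(K, R_wc, C, u, v)
    hits = []
    for tri in triangles:
        a, b, c = (np.asarray(p, dtype=float) for p in tri)
        t = ray_triangle(origin, direction, a, b, c)
        if t is not None:
            hits.append(t)
    if not hits:
        return None
    return origin + min(hits) * direction
```

Taking the nearest hit reflects the stated assumption that each pixel observes an object located against the GPS referenced 3-D model; a pixel whose ray misses every triangle is left without coordinates.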

1-4. (canceled)
5. A method of determining GPS coordinates of some image point(s) positions in at least two images using a processor configured by program instructions comprising: receiving position information of some of the positions where an image capture device captured an image; determining geometry by triangulating various registration objects in the images; determining GPS coordinates of image point(s) positions in at least one of the images; saving GPS coordinates to memory.
6. The method of claim 5 where receiving position information comprises receiving GPS coordinates of at least three image capture device positions.
7. The method of claim 5 also comprises determining the 3-D model associated with the distance to visible objects in the captured images.
8. The method of claim 5 where determining the GPS coordinates of image points positions in at least one of the images comprises using trilaterations relative to the image capture device positions and the local geometry.
9. The method of claim 5 where saving GPS coordinates to memory comprises saving the GPS coordinates to a hard disk drive, register, buffer, optical disk, magnetic disk, random access memory, or portable memory device.
10. The method of claim 5 whereby determining geometry by triangulating various registration objects in the images comprises: determining initial estimate of geometry between image capture device(s) positions and orientations; determining initial estimates of point(s) positions in local image capture device(s) coordinate system(s); optimizing image capture device(s) geometry and point(s) position.
11. The method of claim 5 whereby receiving position information of some of the positions where an image capture device captured an image comprises: using ground truth points to determine the position where an image capture device captured an image.
12. A computer-readable medium storing computer-readable instructions that direct a processor to: receive position information from each position where an image capture device captured an image; determine geometry by triangulating various registration objects in each image; determine GPS coordinates of pixel positions in at least one of the images; save GPS coordinates to memory.
13. The computer-readable medium of claim 12 whereby receive position information from each position where an image capture device captured an image comprises at least three image capture device positions.
14. The computer-readable medium of claim 12 whereby determine GPS coordinates of pixel positions in at least one of the images comprises using trilaterations relative to the image capture device positions and the local geometry.
15. The computer-readable medium of claim 12 whereby save GPS coordinates to memory comprises saving the GPS coordinates to a hard disk drive, register, buffer, optical disk, magnetic disk, random access memory, or portable memory device.
16. The computer-readable medium of claim 12 whereby determine geometry by triangulating various registration objects in each image comprises: determining initial estimate of geometry between image capture device(s) positions and orientations; determining initial estimates of point(s) positions in local image capture device(s) coordinate system(s); optimizing image capture device(s) geometry and point(s) position.
17. The computer-readable medium of claim 12 whereby receive position information from each position where an image capture device captured an image comprises: using ground truth points to determine the position where an image capture device captured an image.
18. A method of determining GPS coordinates of some image point(s) positions in at least two images using a processor configured by program instructions comprising: receiving GPS and IMU of some of the positions where an image capture device captured an image; determining boresight rotation and translation; determining geometry by triangulating various registration objects in each image; determining NED coordinates; determining GPS coordinates of image point(s) positions in at least one of the images; saving GPS coordinates to memory.
19-20. (canceled)